FROM THE VERY CORE OF APPLE:



Applesoft Internal 
Structure 



By C.K. Mesztenyi/Washington Apple Pi 




INTRODUCTION 

This article attempts to describe the overall structure of
Applesoft in the ROM space $D000-F7FF. It may be considered a
preceding chapter to "Applesoft Internals," hereinafter referred
to as [Crossley], which gives descriptions of many subroutines and
of zero page usage. [Crossley] and the other abbreviated
references, cited herein as [Lingwood], [Mesztenyi] and [Golding],
may be found at the conclusion of this article.

Before going into details, I must define certain terms for the
sake of this article, because their use in the Applesoft Manual
may be very confusing. These terms are "statement," "command,"
"instruction," "line number" and "line." The first three of these
are used somewhat interchangeably in the Manual. It refers to REM
and Assignment or LET statements in Chapter 1, lists them as
Commands together with ABS in Appendix O, and assumes them to be
instructions in Chapter 2 and Appendix N. I do not intend to clear
up all the confusions and errors in the syntactic definitions and
in the terminology built on them; instead, the following syntactic
definitions will be used here, with the hope that I will not
confuse the issue further. These definitions are as follows:

statement          := end-st / for-st / ... / new-st

let-st             := assign-st / LET assign-st

compound-statement := statement [CR] /
                      statement : compound-statement

labeled-statement  := linenumber compound-statement



For example, I define a 
"statement" as any of the 64 state- 
ments with the keyword "end," 
"for,"... as listed in the keyword col- 
umn of the Statement Type Entry 
Table; the syntactic rules of these in- 
dividual statements are given in the 
Manual under their descriptions. The 
compound-statement is a list of [sim- 
ple] statements separated by a ":", 
while the labeled-statement is a line 
number followed by the compound- 
statement which the Manual defined 
as "line." [CR] stands for carriage 
return. 

With these definitions, one can 
state that a compound-statement is a 
program in immediate mode, while a 
labeled-statement is a program part in 
deferred mode. 

1. DATA STRUCTURE 

The data areas used by Applesoft 
reside: 

1. Flags and temporaries on Zero 
page. 

2. Five Tables in memory
$D000-D364.

3. Scattered (locally used) data 
interspersed in the program 
area $D365-F7FF. 

4. Zero page load data in 
memory $F10B-F126. 

5. Stored program normally from 
memory address $0801. 

6. Variable areas. 

1.1 Zero Page 

The zero page use is described in (Applesoft, Basic Programming
Reference Manual, pp. 140-141). Further information may be found
in [Crossley], [Mesztenyi], and [Lingwood].



1.2 Tables 

The five tables residing in
$D000-D364 are as follows:

$D000-D07F = Statement Type Entry Table.
$D080-D0B1 = Function Entry Table.
$D0B2-D0CF = Operator Tag and Entry Table.
$D0D0-D25F = Keyword Token Table.
$D260-D364 = ASCII Messages.



The Statement Type Entry Table is used to recognize statements and
to obtain the proper entry points in the program area. It consists
of 64 two-byte entries, each containing the low-high address of an
entry point minus one. The order of the 64 entries corresponds to
the tokens 128 to 191 assigned to the keywords END to NEW, as
given in (Applesoft, Basic Programming Reference Manual, p. 121).
Table 1 summarizes these data, giving the actual entry point
addresses.
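As a rough illustration of how one such entry is decoded, the following Python sketch rebuilds an entry point from the two stored bytes. The names `table_offset` and `entry_point` are hypothetical helpers, not Applesoft routines, and the sample bytes $6F, $D8 are simply the END entry point $D870 minus one, written low-high.

```python
TABLE_BASE = 0xD000   # start of the Statement Type Entry Table
FIRST_TOKEN = 0x80    # token of END, the first statement keyword


def table_offset(token: int) -> int:
    """Byte offset of a statement token's 2-byte entry within the table."""
    if not 0x80 <= token <= 0xBF:
        raise ValueError("not a statement token")
    return (token - FIRST_TOKEN) * 2


def entry_point(lo: int, hi: int) -> int:
    """Rebuild the entry point from the stored low-high 'address minus one'."""
    return (((hi << 8) | lo) + 1) & 0xFFFF


# END (token $80): stored bytes $6F $D8 give the entry point $D870.
end_entry = entry_point(0x6F, 0xD8)
```

The "minus one" form suits an address that is pushed on the stack and reached through a return instruction, since the 6502 RTS resumes at the stacked address plus one.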



TABLE 1

Statement Type Entry Table From $D000-D07F

Hex    Key      Entry     Hex    Key      Entry
Token  Word     Point     Token  Word     Point

$80    END      $D870     $A0    COLOR=   $F24F
$81    FOR      $D766     $A1    POP      $D96B
$82    NEXT     $DCF9     $A2    VTAB     $F256
$83    DATA     $D995     $A3    HIMEM:   $F286
$84    INPUT    $DBB2     $A4    LOMEM:   $F2A6
$85    DEL      $F331     $A5    ONERR    $F2CB
$86    DIM      $DFD9     $A6    RESUME   $F318
$87    READ     $DBE2     $A7    RECALL   $F3BC
$88    GR       $F390     $A8    STORE    $F39F
$89    TEXT     $F399     $A9    SPEED=   $F262
$8A    PR#      $F1E5     $AA    LET      $DA46
$8B    IN#      $F1DE     $AB    GOTO     $D93E
$8C    CALL     $F1D5     $AC    RUN      $D912
$8D    PLOT     $F225     $AD    IF       $D9C9
$8E    HLIN     $F232     $AE    RESTORE  $D849
$8F    VLIN     $F241     $AF    &        $03F5
$90    HGR2     $F3D8     $B0    GOSUB    $D921
$91    HGR      $F3E2     $B1    RETURN   $D96B
$92    HCOLOR=  $F6E9     $B2    REM      $D9DC
$93    HPLOT    $F6FE     $B3    STOP     $D86E
$94    DRAW     $F769     $B4    ON       $D9EC
$95    XDRAW    $F76F     $B5    WAIT     $E784
$96    HTAB     $F7E7     $B6    LOAD     $D8C9
$97    HOME     $FC58     $B7    SAVE     $D8B0
$98    ROT=     $F721     $B8    DEF      $E313
$99    SCALE=   $F727     $B9    POKE     $E77B
$9A    SHLOAD   $F775     $BA    PRINT    $DAD5
$9B    TRACE    $F26D     $BB    CONT     $D896
$9C    NOTRACE  $F26F     $BC    LIST     $D6A5
$9D    NORMAL   $F273     $BD    CLEAR    $D66A
$9E    INVERSE  $F277     $BE    GET      $DBA0
$9F    FLASH    $F280     $BF    NEW      $D649

TABLE 2

Function Entry Table From $D080-D0B1

Hex    Key     Entry     Hex    Key     Entry
Token  Word    Point     Token  Word    Point

$D2    SGN     $EB90     $DF    SIN     $EFF1
$D3    INT     $EC23     $E0    TAN     $F03A
$D4    ABS     $EBAF     $E1    ATN     $F09E
$D5    USR     $000A     $E2    PEEK    $E764
$D6    FRE     $E2DE     $E3    LEN     $E6D6
$D7    SCRN(   $D412     $E4    STR$    $E3C5
$D8    PDL     $DFCD     $E5    VAL     $E707
$D9    POS     $E2FF     $E6    ASC     $E6E5
$DA    SQR     $EE8D     $E7    CHR$    $E646
$DB    RND     $EFAE     $E8    LEFT$   $E65A
$DC    LOG     $E941     $E9    RIGHT$  $E686
$DD    EXP     $EF09     $EA    MID$    $E691
$DE    COS     $EFEA
The Function Entry Table is used during expression evaluation to
obtain entry points to the function subroutines in the program
area. It consists of 25 two-byte entries with low-high addresses.
The order of the entries corresponds to the tokens 210 to 234
assigned to the keywords SGN to MID$, as given in (Applesoft,
Basic Programming Reference Manual, p. 121). Table 2 gives the
summary. The descriptions of the function subroutines with their
entry points are given in [Crossley].

The Operator Tag and Entry Table is used during expression
evaluation. It consists of 10 three-byte entries corresponding to
the tokens 200 to 209 assigned to the keywords + to <, as given in
(Applesoft, Basic Programming Reference Manual, p. 121). Of these
three bytes, the first byte contains the Tag, which also serves as
a precedence number. The next two bytes contain the low-high
address minus one of the entry point in the program area. Table 3
shows the Tag values and actual entry point addresses.



TABLE 3

Operator Tag and Entry Table From $D0B2-D0CF

Hex    Key    Hex    Entry
Token  Word   Tag    Point

$C8    +      $79    $E7C1
$C9    -      $79    $E7AA
$CA    *      $7B    $E982
$CB    /      $7B    $EA69
$CC    ^      $7D    $EE97
$CD    AND    $50    $DF55
$CE    OR     $46    $DF4F
$CF    >      $7F    $EED0
$D0    =      $7F    $DE98
$D1    <      $64    $DF65
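The three-byte layout of the Operator Tag and Entry Table can be sketched in Python as follows. This is only an illustration under stated assumptions: `OperatorEntry` and `parse_operator_table` are hypothetical names, and the sample bytes are not read from the ROM but derived from Table 3 for the + and - operators (tag, then entry point minus one, low-high).

```python
from typing import List, NamedTuple


class OperatorEntry(NamedTuple):
    tag: int          # precedence tag (first byte of the entry)
    entry_point: int  # actual entry address (stored value plus one)


def parse_operator_table(raw: bytes) -> List[OperatorEntry]:
    """Split a run of 3-byte entries into (tag, entry point) pairs."""
    if len(raw) % 3 != 0:
        raise ValueError("table length must be a multiple of 3")
    entries = []
    for i in range(0, len(raw), 3):
        tag, lo, hi = raw[i], raw[i + 1], raw[i + 2]
        entries.append(OperatorEntry(tag, (((hi << 8) | lo) + 1) & 0xFFFF))
    return entries


# Assumed sample: the first two entries (+ and -) as implied by Table 3.
sample = bytes([0x79, 0xC0, 0xE7, 0x79, 0xA9, 0xE7])
ops = parse_operator_table(sample)
```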



The Keyword Token Table is used by the Tokenizer routine, which
replaces keywords with the appropriate tokens. It consists of the
107 keywords (from END to MID$) concatenated such that each byte
is an ASCII character with its high bit set to zero, unless the
character is the last one of a keyword, in which case the high bit
is set to one; e.g. it contains

ENDFORNEXT...

where the last character of each keyword (here D, R and T) has its
high bit set to one.
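A minimal Python sketch of reading such a high-bit-terminated table is given here. The function names are hypothetical, and the sample bytes are just the ASCII codes of ENDFORNEXT with the high bit added to D, R and T; they are not a dump of the ROM.

```python
def split_keywords(table: bytes) -> list:
    """Split a high-bit-terminated keyword table into keyword strings."""
    keywords = []
    current = []
    for byte in table:
        current.append(chr(byte & 0x7F))   # strip the end-of-keyword marker
        if byte & 0x80:                    # high bit set: last character
            keywords.append("".join(current))
            current = []
    return keywords


def token_for(index: int) -> int:
    """Token of the keyword at a given position; the first keyword is $80."""
    return 0x80 + index


# Assumed sample: E N D+$80, F O R+$80, N E X T+$80
sample = bytes([0x45, 0x4E, 0xC4, 0x46, 0x4F, 0xD2, 0x4E, 0x45, 0x58, 0xD4])
words = split_keywords(sample)
```

Because the position of a keyword in the table determines its token, no token values need to be stored in the table itself.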
The ASCII Message Table contains ASCII characters in which each
individual message (e.g. the error message part "SYNTAX ERROR") is
terminated either by having the high bit set in its last character
byte, or by a following zero byte.

1.3 Scattered Data 

Scattered data occur in many places; among them are the floating
point constants (see [Crossley] and [Lingwood]) and a short table
for high resolution graphics (see [Mesztenyi]).

1.4 Zero Page Load Data 

The memory area $F10B-F126 is 
the CHRGET/CHRGOT routine 
followed by an initial random number 
which gets loaded into the Zero page 
$B1-CC during initialization. 

1.5 Stored Program Area 

Zero page locations $67-68 contain the address (low-high) of the
beginning of the stored program, usually $0801. From this address
on, the memory contains the tokenized labeled-statements ordered
by their line numbers. The format of a tokenized labeled-statement
is as follows:

2-byte pointer (low-high address) to the next tokenized
labeled-statement

2-byte binary value (low-high) of the line number

n bytes of the actual tokenized compound-statement

1-byte containing zero

The last tokenized labeled-statement is followed by two extra
bytes containing zero. Thus the stored program has a chain of
pointers starting with the contents of $67-68, and ending with a
zero value. Each pointer indicates the beginning of a
labeled-statement, while a byte containing zero indicates its end;
three zero bytes indicate the end of the stored program.
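The pointer chain described above can be walked with a few lines of Python. This is a sketch under assumptions: `parse_program` is a hypothetical helper, the memory image is a hand-built example (line 10 holding the END token $80, line 20 holding the GOTO token $AB followed by the ASCII digits "10"), and no error handling for a damaged chain is attempted.

```python
def parse_program(memory: bytes, start: int = 0x0801, base: int = 0x0801):
    """Walk the pointer chain; return (line_number, tokenized_bytes) pairs.

    `memory` holds the program image and `base` is the address of memory[0].
    """
    lines = []
    addr = start
    while True:
        i = addr - base
        next_addr = memory[i] | (memory[i + 1] << 8)
        if next_addr == 0:                 # zero pointer: end of program
            break
        line_number = memory[i + 2] | (memory[i + 3] << 8)
        end = memory.index(0, i + 4)       # zero byte closing the statement
        lines.append((line_number, bytes(memory[i + 4:end])))
        addr = next_addr
    return lines


# Assumed image starting at $0801: "10 END" and "20 GOTO 10".
image = bytes([
    0x07, 0x08, 0x0A, 0x00, 0x80, 0x00,              # line 10, next at $0807
    0x0F, 0x08, 0x14, 0x00, 0xAB, 0x31, 0x30, 0x00,  # line 20, next at $080F
    0x00, 0x00,                                      # zero pointer
])
program = parse_program(image)
```

Note that the image ends with three zero bytes: the terminator of the last labeled-statement and the two bytes of the zero pointer.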

1.6 Variable Areas 

These areas and corresponding 
pointers are adequately described in 
(Applesoft, Basic Programming Refer- 
ence Manual), with further explana- 
tions in [Golding]. 

Call —APPLE. January 1982 



2. CHRGET/CHRGOT 
SUBROUTINE. 

The most important subroutine in 
Applesoft is the CHRGET/CHRGOT 
subroutine residing on the Zero page 
$B1-C8 with the TXTPTR imbedded 
at $B8-$B9. It has been described in 
[Crossley], but is repeated here 
because of its importance. 

The CHRGOT entry ($B7) loads register A with the contents of the
memory location whose address is in TXTPTR ($B8-B9, low-high). The
CHRGET entry ($B1) does the same, except that it increments TXTPTR
prior to loading. If the obtained byte is equal to the ASCII space
($20), then control goes back to CHRGET, i.e. spaces (blanks) are
skipped. Otherwise flag Z is set if A=$3A or $00, i.e. ASCII colon
(:) or null; flag C is set if A is not an ASCII digit 0 to 9, i.e.
A<$30 or A>$39; finally control goes back to the calling routine.
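The behavior just described can be modeled in Python as below. This is a behavioral sketch only, not a translation of the 6502 code: the function name `chrget`, the tuple it returns, and the use of a Python bytes object as "memory" are all assumptions made for illustration.

```python
def chrget(memory, txtptr, increment=True):
    """Model CHRGET (increment=True) or CHRGOT (increment=False).

    Returns (a, z_flag, c_flag, txtptr) following the description above.
    """
    if increment:
        txtptr += 1
    a = memory[txtptr]
    while a == 0x20:                  # blanks are skipped by looping to CHRGET
        txtptr += 1
        a = memory[txtptr]
    z = a in (0x3A, 0x00)             # colon or null: end of a statement
    c = not (0x30 <= a <= 0x39)       # carry is clear only for ASCII digits
    return a, z, c, txtptr


# Assumed sample text; TXTPTR starts one byte before the first character read.
text = b"X  7:"
first = chrget(text, 0)               # skips two blanks, returns the digit 7
second = chrget(text, first[3])       # returns the colon
again = chrget(text, first[3], increment=False)   # CHRGOT re-reads the digit
```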

The importance of this routine comes to light if one compares it
to the instruction fetch cycle of a computer, with TXTPTR as the
program counter register. The instruction code is returned in
register A and flags Z and C, ready to be executed (interpreted).
The ASCII space code behaves like a no-op, and is automatically
skipped. This program counter role is exploited in the
implementation of the GOSUB- and RETURN-statements: the
GOSUB-statement places the TXTPTR value together with the
line-number and the tag $B0 on the stack, and the RETURN-statement
resets them.

Unfortunately, the CALL-statement has been implemented
differently, without saving the above data on the stack. It would
have been simple to implement it in the same way as the
GOSUB-statement, so that the RETURN-statement could have served as
a return from the machine language subroutine. This would have
allowed a stored program to call the Applesoft routine at $D43C
with a CALL-statement, requesting input of a compound-statement
ending with RETURN to be executed in immediate mode, where the
RETURN causes the return to the stored program.



3. PROGRAM STRUCTURE 

The overall program structure of 
Applesoft can be illustrated by the 
following semantic program: 

3.1. Initialization 

3.2. Request and receive input
from the keyboard.

3.3. Tokenize the input 

3.4. If the first character of the in- 
put is an ASCII number then 
store the input as part of the 
stored program, and GOTO 
3.2. 

3.5. If the first character of the in- 
put is not an ASCII number 
then execute the input as a 
program, after which GOTO 
3.2. 
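Steps 3.2 to 3.5 amount to a simple read, tokenize and dispatch loop, which might be sketched in Python as follows. All names here (`process_line`, `main_loop` and the callables passed in) are hypothetical stand-ins for the Applesoft routines discussed in the following sections, not their actual interfaces.

```python
def process_line(tokenized: bytes, store, execute) -> str:
    """Steps 3.4 and 3.5: dispatch one tokenized input line."""
    first = tokenized[0:1]
    if first.isdigit():            # ASCII '0'..'9': a labeled-statement
        store(tokenized)
        return "stored"
    execute(tokenized)             # otherwise run it in immediate mode
    return "executed"


def main_loop(read_line, tokenize, store, execute):
    """Steps 3.2 to 3.5, repeated until read_line() returns None."""
    while True:
        line = read_line()         # 3.2: request and receive input
        if line is None:
            break
        process_line(tokenize(line), store, execute)   # 3.3, then 3.4 or 3.5


# Usage with stand-in callables that merely record what they were given.
stored, executed = [], []
lines = iter([b"10 \x80", b"\xba 1", None])
main_loop(lambda: next(lines), lambda raw: raw, stored.append, executed.append)
```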

3.1 Initialization

The Initialization (starting at 
$F128) sets up the Zero page and 
various other pointers. 

3.2 Input 

The input request starts at $D43C. It uses the subroutine at $D52E
to display the prompt symbol and, through the Monitor GETLN, to
receive the input line into the input buffer at $0200. It sets the
high bits of the input data to zero, places a zero byte after the
last input character, and initializes TXTPTR to the input buffer
address minus one.

3.3 Tokenization 

The Tokenization Subroutine 
($D559-D619, with entry at $D559) 
replaces the keywords with the appro- 
priate tokens in the input buffer. It also 
removes blanks with the result still in 
the input buffer. It places two extra 
zero bytes at the end of the line. No 
syntax checking is performed by this 
routine. 
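A much simplified tokenizer along these lines could look like the Python sketch below. It is an assumption-laden illustration rather than the routine at $D559: the keyword list is a small hypothetical subset (token values taken from Table 1), keywords are tried in token order with the first match winning, and any special cases the real routine may handle (for example text inside quotation marks) are ignored.

```python
# Hypothetical subset of the keyword table, in token order.
KEYWORDS = [("END", 0x80), ("FOR", 0x81), ("NEXT", 0x82),
            ("GOTO", 0xAB), ("PRINT", 0xBA)]


def tokenize(line: str) -> bytes:
    """Replace keywords with tokens and drop blanks; no syntax checking."""
    out = bytearray()
    i = 0
    while i < len(line):
        if line[i] == " ":                 # blanks are removed
            i += 1
            continue
        for word, token in KEYWORDS:
            if line.startswith(word, i):   # keyword found: emit its token
                out.append(token)
                i += len(word)
                break
        else:
            out.append(ord(line[i]) & 0x7F)   # ordinary character, high bit clear
            i += 1
    out += b"\x00\x00\x00"   # terminating zero plus the two extra zero bytes
    return bytes(out)


result = tokenize("PRINT X:GOTO 10")
```

Like the routine described above, the sketch performs no syntax checking, so a keyword embedded in a longer word is tokenized as well.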

Following Tokenization, the first 
character in the input buffer decides 
whether 3.4 or 3.5 is to be executed. 



3.4 Stored Program 

If the first character in the input 
buffer is an ASCII number, then 
Applesoft assumes it to be the first 
character of a line-number of a labeled- 
statement and either inserts it or 
replaces an old labeled-statement with 
the same line-number in the stored pro- 
gram with the help of the routine start- 
ing at $D46A. 

3.5 Execution 

If the first character of the input is not an ASCII number, then
Applesoft assumes the input to be a compound-statement ready to be
executed. It sets TXTPTR to the beginning of the input buffer and
enters into an execution loop at $D805. At this stage, TXTPTR
really behaves like a program counter. The execution of a
statement advances or changes TXTPTR, e.g. to the stored program.
Finally, control returns to 3.2, requesting new input, under any
of the following conditions:

(i) Execution of an end- or stop-statement

(ii) Encountering 3 consecutive zero bytes

(iii) Detecting a syntax error without an onerr-statement.

Individual statements are recog- 
nized by their first, possibly tokenized, 
byte. If this is between $80 and $BF 
then it is assumed to be a token, and 
the statement is executed by jumping 
to the appropriate entry point listed in 
Table 1. Otherwise it is assumed to be 
a let-statement without the word LET. 
These statement execution routines 
are called subroutines, but not all of 
them return. 
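The classification of a statement by its first byte can be summarized in a short Python sketch. It follows the text above together with the STYPE portion of the listing (which treats bytes above $BF as an error and returns at once on a colon or zero byte); the function name and the returned labels are hypothetical.

```python
def classify_statement(first_byte: int):
    """Classify a statement by its first byte.

    Returns ("end", None) for a statement terminator,
            ("let", None) for an implied LET,
            ("token", offset) with the byte offset into the
            Statement Type Entry Table, or ("error", None).
    """
    if first_byte in (0x00, 0x3A):     # null or colon: nothing to execute
        return ("end", None)
    if first_byte < 0x80:              # not a token: let-statement without LET
        return ("let", None)
    if first_byte > 0xBF:              # token outside END..NEW
        return ("error", None)
    return ("token", (first_byte - 0x80) * 2)
```

For example, the byte $BA (PRINT) gives the table offset $74, so its entry is read from $D074-D075.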

The execution loop in $D805-D848, and its preceding section in
$D7D2-D804, is fairly complex. It is listed below with appropriate
remarks.

CONCLUSION 

With the knowledge of the Data Structure, one may trace the
internal workings of Applesoft based on the five point (3.1 to
3.5) Program Structure, and on the 64 statement interpreter
subroutines with the given entry points. There are two difficult
parts which need further documentation:

1. The expression evaluation routine, called FRMEVL in [Crossley],
which is used by many statement routines. I think that part of the
complication is because Applesoft had been implemented before its
syntactic rules were (correctly?) established.

2. The multiple use of the stack. Besides the statement
subroutines (GOSUB-, RETURN-, CALL-, FOR- and NEXT-statement),
FRMEVL uses it, and so does the internal program of Applesoft
(JSR, RTS instructions).
Statement Handler Routine $D7D2-$D804

NEWSTT   TSX              ;Save
         STX $F8          ;stack pointer
         JSR $D858        ;Checks for Ctrl C
         LDA $B8          ;Get
         LDY $B9          ;TXTPTR
         LDX $76          ;Check if immediate mode
         INX              ;($FF in current line nbr)
         BEQ N1
         STA $79          ;No, thus put TXTPTR into
         STY $7A          ;old TXTPTR
N1       LDY #$00         ;Check byte at TXTPTR
         LDA ($B8),Y
         BNE COLON        ;If non-zero then it should be ':'
         LDY #$02         ;If zero then end of compound-st.
         LDA ($B8),Y      ;Check for end of program
         CLC              ;Zero pointer 2 bytes further
         BEQ PREND
         INY              ;It is a new labeled-statement
         LDA ($B8),Y      ;Get and store new
         STA $75          ;current line nbr
         INY
         LDA ($B8),Y
         STA $76
         TYA              ;Update TXTPTR
         ADC $B8
         STA $B8
         BCC EXECUTE
         INC $B9
EXECUTE  BIT $F2          ;Check for the trace bit
         BPL L1           ;Notrace if positive
         LDX $76          ;Trace is on, check
         INX              ;for mode
         BEQ L1           ;No print in immediate mode
         LDA #$23         ;Print out line nbr
         JSR $DB5C        ;as trace information
         LDX $75
         LDA $76
         JSR $ED24
         JSR $DB57
L1       JSR CHRGET       ;Get first byte of statement
         JSR STYPE        ;Use JSR to get return address in
                          ;stack for
STTRET   JMP NEWSTT       ;<-- statement execution subroutine
                          ;returns here
PREND    BEQ $D88A        ;End of program
STYPE    BEQ $D857        ;Statement type check on its first byte
         SBC #$80
         BCC ASGST        ;<$80 then assign-statement
         CMP #$40
         BCS $D846        ;>$BF then error
         ASL              ;Otherwise get
         TAY              ;entry point
         LDA $D001,Y      ;from the 2-byte
         PHA              ;statement-type table
         LDA $D000,Y      ;and put it into stack
         PHA              ;as return address of CHRGET
         JMP CHRGET       ;and go to there
ASGST    JMP $DA46        ;Go to LET-st. routine
COLON    CMP #$3A         ;Check for colon
         BEQ EXECUTE      ;Yes, go to execute
         JMP $DEC9        ;Otherwise error

Addresses: NEWSTT  $D7D2    N1     $D7E5
           EXECUTE $D805    L1     $D81D
           STTRET  $D823    PREND  $D826
           STYPE   $D828    ASGST  $D83F
           COLON   $D842